Deriving a Domain Specific Test Collection from a Query Log
Abstract
Cultural heritage, and other special domains, pose a particular problem for information retrieval: evaluation requires a dedicated test collection that takes the particular documents and information requests into account, but building such a test collection requires substantial human effort. This paper investigates methods of generating a document retrieval test collection from a search engine’s transaction log, based on submitted queries and user-click data. We test our methods on a museum’s search log file, and compare the quality of the generated test collections against a collection with manually generated and judged known-item topics. Our main findings are the following. First, the test collection derived from a transaction log corresponds well to the actual search experience of real users. Second, the ranking of systems based on the derived judgments corresponds well to the ranking based on the manual topics. Third, deriving pseudo-relevance judgments from a transaction log file is an attractive option in domains where dedicated test collections are not readily available.
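As a rough illustration of the idea in the abstract, the following is a minimal sketch of how click data could be turned into pseudo-relevance judgments and used to score retrieval runs. It is not the authors' method. The log layout (pairs of query and clicked document ID), the `min_clicks` threshold, the query normalization, and all function names are assumptions made for this example. Mean reciprocal rank is used because it is a common measure for known-item search.

```python
from collections import defaultdict


def derive_qrels(log_rows, min_clicks=1):
    """Build pseudo-relevance judgments from (query, clicked_doc_id) pairs.

    A document counts as relevant for a query if it was clicked at least
    `min_clicks` times for that query (hypothetical threshold).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query, doc_id in log_rows:
        # Assumed normalization: trim whitespace and lowercase the query.
        counts[query.strip().lower()][doc_id] += 1

    qrels = {}
    for query, docs in counts.items():
        relevant = {doc for doc, n in docs.items() if n >= min_clicks}
        if relevant:  # drop queries with no document above the threshold
            qrels[query] = relevant
    return qrels


def mean_reciprocal_rank(run, qrels):
    """Score a run (query -> ranked list of doc IDs) against the judgments."""
    total = 0.0
    for query, relevant in qrels.items():
        for rank, doc_id in enumerate(run.get(query, []), start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(qrels) if qrels else 0.0


# Toy log: the query "vase" led to two clicks on d1 and one on d2.
log = [("Vase ", "d1"), ("vase", "d1"), ("vase", "d2"), ("painting", "d7")]
qrels = derive_qrels(log, min_clicks=2)

run_a = {"vase": ["d1", "d2"]}
run_b = {"vase": ["d2", "d1"]}
scores = {
    "run_a": mean_reciprocal_rank(run_a, qrels),
    "run_b": mean_reciprocal_rank(run_b, qrels),
}
# Systems ordered from best to worst under the derived judgments.
system_ranking = sorted(scores, key=scores.get, reverse=True)
```

The system ordering obtained this way could then be compared with the ordering produced by manually judged topics, which is the kind of comparison the abstract reports.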
Similar resources
Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)
Background and Aim: Information systems cannot be well designed or developed without a clear understanding of users' needs and of how they seek and evaluate information. This research was designed to analyze the query refinement behaviors of users of Ganj (Iranian research institute of science and technology database) via log analysis. Methods: The method of this research is log anal...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
A Search Log-Based Approach to Evaluation
Anyone offering content in a digital library is naturally interested in assessing its performance: how well does my system meet the users’ information needs? Standard evaluation benchmarks have been developed in information retrieval that can be used to test retrieval effectiveness. However, these generic benchmarks focus on a single document genre, language, media-type, and searcher stereotype...
Discovering Popular Clicks' Pattern of Teen Users for Query Recommendation
Search engines are still the most important gateways for information search on the internet. In this regard, providing the best response to the user's request in the shortest time possible is still desired. Normally, search engines are designed for adults, and few policies have been employed that consider teen users. Teen users are more biased in clicking the results list than are adult users. This leads...
Deriving query suggestions for site search
Modern search engines have been moving away from very simplistic interfaces that aimed at satisfying a user’s need with a single-shot query. Interactive features such as query suggestions and faceted search are now integral parts of Web search engines. Generating good query modification suggestions or alternative queries to assist a searcher remains however a challenging issue. Query log analys...
Publication date: 2007